DataFrames Julia Introduction Tutorial Solution
This is one possible solution to the DataFrames tutorial exercise.
using DataFrames
using CSV
[ Info: Precompiling DataFrames [a93c6f00-e57d-5684-b7b6-d8193f3e46c0]
higgs_ml = CSV.read(joinpath("..", "assets", "atlas-higgs-challenge-2014-v2-reduced.csv"), DataFrame)
50000×35 DataFrame
49975 rows omitted
| Row | EventId | DER_mass_MMC | DER_mass_transverse_met_lep | DER_mass_vis | DER_pt_h | DER_deltaeta_jet_jet | DER_mass_jet_jet | DER_prodeta_jet_jet | DER_deltar_tau_lep | DER_pt_tot | DER_sum_pt | DER_pt_ratio_lep_tau | DER_met_phi_centrality | DER_lep_eta_centrality | PRI_tau_pt | PRI_tau_eta | PRI_tau_phi | PRI_lep_pt | PRI_lep_eta | PRI_lep_phi | PRI_met | PRI_met_phi | PRI_met_sumet | PRI_jet_num | PRI_jet_leading_pt | PRI_jet_leading_eta | PRI_jet_leading_phi | PRI_jet_subleading_pt | PRI_jet_subleading_eta | PRI_jet_subleading_phi | PRI_jet_all_pt | Weight | Label | KaggleSet | KaggleWeight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String1 | String1 | Float64 | |
| 1 | 100000 | 138.47 | 51.655 | 97.827 | 27.98 | 0.91 | 124.711 | 2.666 | 3.064 | 41.928 | 197.76 | 1.582 | 1.396 | 0.2 | 32.638 | 1.017 | 0.381 | 51.626 | 2.273 | -2.414 | 16.824 | -0.277 | 258.733 | 2 | 67.435 | 2.15 | 0.444 | 46.062 | 1.24 | -2.475 | 113.497 | 0.00081448 | s | t | 0.00265331 |
| 2 | 100001 | 160.937 | 68.768 | 103.235 | 48.146 | -999.0 | -999.0 | -999.0 | 3.473 | 2.078 | 125.157 | 0.879 | 1.414 | -999.0 | 42.014 | 2.039 | -3.011 | 36.918 | 0.501 | 0.103 | 44.704 | -1.916 | 164.546 | 1 | 46.226 | 0.725 | 1.158 | -999.0 | -999.0 | -999.0 | 46.226 | 0.681042 | b | t | 2.23358 |
| 3 | 100002 | -999.0 | 162.172 | 125.953 | 35.635 | -999.0 | -999.0 | -999.0 | 3.148 | 9.336 | 197.814 | 3.776 | 1.414 | -999.0 | 32.154 | -0.705 | -2.093 | 121.409 | -0.953 | 1.052 | 54.283 | -2.186 | 260.414 | 1 | 44.251 | 2.053 | -2.028 | -999.0 | -999.0 | -999.0 | 44.251 | 0.715742 | b | t | 2.34739 |
| 4 | 100003 | 143.905 | 81.417 | 80.943 | 0.414 | -999.0 | -999.0 | -999.0 | 3.31 | 0.414 | 75.968 | 2.354 | -1.285 | -999.0 | 22.647 | -1.655 | 0.01 | 53.321 | -0.522 | -3.1 | 31.082 | 0.06 | 86.062 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -0.0 | 1.66065 | b | t | 5.44638 |
| 5 | 100004 | 175.864 | 16.915 | 134.805 | 16.405 | -999.0 | -999.0 | -999.0 | 3.891 | 16.405 | 57.983 | 1.056 | -1.385 | -999.0 | 28.209 | -2.197 | -2.231 | 29.774 | 0.798 | 1.569 | 2.723 | -0.871 | 53.131 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.90426 | b | t | 6.24533 |
| 6 | 100005 | 89.744 | 13.55 | 59.149 | 116.344 | 2.636 | 284.584 | -0.54 | 1.362 | 61.619 | 278.876 | 0.588 | 0.479 | 0.975 | 53.651 | 0.371 | 1.329 | 31.565 | -0.884 | 1.857 | 40.735 | 2.237 | 282.849 | 3 | 90.547 | -2.412 | -0.653 | 56.165 | 0.224 | 3.106 | 193.66 | 0.0254338 | b | t | 0.083414 |
| 7 | 100006 | 148.754 | 28.862 | 107.782 | 106.13 | 0.733 | 158.359 | 0.113 | 2.941 | 2.545 | 305.967 | 3.371 | 1.393 | 0.791 | 28.85 | 1.113 | 2.409 | 97.24 | 0.675 | -0.966 | 38.421 | -1.443 | 294.074 | 2 | 123.01 | 0.864 | 1.45 | 56.867 | 0.131 | -2.767 | 179.877 | 0.00081448 | s | t | 0.00265331 |
| 8 | 100007 | 154.916 | 10.418 | 94.714 | 29.169 | -999.0 | -999.0 | -999.0 | 2.897 | 1.526 | 138.178 | 0.365 | -1.305 | -999.0 | 78.8 | 0.654 | 1.547 | 28.74 | 0.506 | -1.347 | 22.275 | -1.761 | 187.299 | 1 | 30.638 | -0.715 | -1.724 | -999.0 | -999.0 | -999.0 | 30.638 | 0.00572068 | s | t | 0.0186361 |
| 9 | 100008 | 105.594 | 50.559 | 100.989 | 4.288 | -999.0 | -999.0 | -999.0 | 2.904 | 4.288 | 65.333 | 0.675 | -1.366 | -999.0 | 39.008 | 2.433 | -2.532 | 26.325 | 0.21 | 1.884 | 37.791 | 0.024 | 129.804 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.6148 | b | t | 5.296 |
| 10 | 100009 | 128.053 | 88.941 | 69.272 | 193.392 | -999.0 | -999.0 | -999.0 | 1.609 | 28.859 | 255.123 | 0.599 | 0.538 | -999.0 | 54.646 | -1.533 | 0.416 | 32.742 | -0.317 | -0.636 | 132.678 | 0.845 | 294.741 | 1 | 167.735 | -2.767 | -2.514 | -999.0 | -999.0 | -999.0 | 167.735 | 0.000461025 | s | t | 0.00150187 |
| 11 | 100010 | -999.0 | 86.24 | 79.692 | 27.201 | -999.0 | -999.0 | -999.0 | 2.338 | 27.201 | 81.734 | 1.75 | -1.412 | -999.0 | 29.718 | -0.866 | 2.878 | 52.016 | 0.126 | -1.288 | 51.276 | 0.688 | 250.178 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 0.701141 | b | t | 2.2995 |
| 12 | 100011 | 114.744 | 10.286 | 75.712 | 30.816 | 2.563 | 252.599 | -1.401 | 2.888 | 36.745 | 239.804 | 1.061 | 1.364 | 0.769 | 35.976 | -0.669 | -0.342 | 38.188 | -0.165 | 2.502 | 22.385 | 2.148 | 290.547 | 3 | 76.773 | -0.79 | 0.303 | 56.876 | 1.773 | -2.079 | 165.64 | 0.093659 | b | t | 0.30717 |
| 13 | 100012 | 145.297 | 64.234 | 103.565 | 106.999 | -999.0 | -999.0 | -999.0 | 2.183 | 24.66 | 192.245 | 0.576 | 0.689 | -999.0 | 62.89 | -0.766 | -1.632 | 36.237 | 0.722 | -0.035 | 43.91 | -1.907 | 232.362 | 1 | 93.117 | -0.97 | 1.943 | -999.0 | -999.0 | -999.0 | 93.117 | 0.51274 | b | t | 1.68161 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 49989 | 149988 | 91.594 | 70.574 | 57.615 | 34.452 | -999.0 | -999.0 | -999.0 | 2.843 | 3.009 | 89.138 | 1.227 | -1.401 | -999.0 | 25.443 | 1.4 | -1.228 | 31.219 | 1.939 | 2.264 | 45.691 | -0.149 | 111.575 | 1 | 32.476 | -1.057 | 2.877 | -999.0 | -999.0 | -999.0 | 32.476 | 0.617144 | b | t | 2.02402 |
| 49990 | 149989 | -999.0 | 68.957 | 76.191 | 2.304 | -999.0 | -999.0 | -999.0 | 2.32 | 2.304 | 74.611 | 1.072 | -1.414 | -999.0 | 36.01 | -1.651 | -0.452 | 38.601 | -0.594 | -2.517 | 40.181 | 1.633 | 96.391 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.43486 | b | t | 4.70584 |
| 49991 | 149990 | 118.34 | 51.063 | 51.062 | 116.277 | 1.31 | 238.766 | 4.362 | 1.785 | 2.564 | 292.117 | 2.008 | 1.414 | 0.0 | 23.089 | 0.3 | 2.12 | 46.356 | 0.158 | -2.384 | 71.478 | 2.98 | 338.491 | 2 | 166.372 | -2.844 | -0.148 | 56.3 | -1.534 | 2.692 | 222.672 | 0.00572068 | s | t | 0.0186361 |
| 49992 | 149991 | 131.127 | 58.566 | 85.136 | 151.547 | -999.0 | -999.0 | -999.0 | 1.455 | 11.819 | 268.611 | 0.989 | 1.342 | -999.0 | 64.296 | -1.95 | -2.078 | 63.568 | -2.053 | -0.626 | 57.335 | -1.639 | 355.655 | 1 | 140.746 | 0.593 | 1.712 | -999.0 | -999.0 | -999.0 | 140.746 | 0.000461282 | s | t | 0.0015027 |
| 49993 | 149992 | -999.0 | 95.864 | 90.317 | 21.262 | -999.0 | -999.0 | -999.0 | 2.907 | 21.262 | 64.717 | 2.167 | -1.096 | -999.0 | 20.432 | 0.5 | 2.282 | 44.285 | -1.579 | 0.251 | 51.998 | -2.795 | 184.146 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 0.832855 | b | t | 2.73148 |
| 49994 | 149993 | 99.452 | 81.13 | 73.539 | 70.79 | -999.0 | -999.0 | -999.0 | 2.142 | 41.225 | 159.922 | 2.991 | 1.194 | -999.0 | 24.108 | 1.616 | -2.424 | 72.108 | 1.407 | -0.292 | 34.003 | -2.212 | 216.834 | 1 | 63.705 | 2.884 | 2.632 | -999.0 | -999.0 | -999.0 | 63.705 | 0.000461282 | s | t | 0.0015027 |
| 49995 | 149994 | 59.653 | 67.648 | 50.783 | 24.749 | -999.0 | -999.0 | -999.0 | 1.621 | 24.749 | 72.522 | 1.717 | -1.414 | -999.0 | 26.691 | -1.067 | -2.872 | 45.83 | -0.939 | -1.256 | 29.411 | 1.087 | 177.077 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.41038 | b | t | 4.62555 |
| 49996 | 149995 | 133.013 | 29.198 | 78.723 | 110.576 | 0.666 | 281.596 | -0.073 | 1.96 | 56.406 | 501.854 | 0.418 | 1.398 | 0.0 | 73.174 | 0.819 | 1.32 | 30.581 | 0.952 | -0.636 | 51.207 | 0.12 | 553.857 | 3 | 193.297 | 0.14 | -1.798 | 92.691 | -0.526 | 1.653 | 398.099 | 0.0254338 | b | t | 0.083414 |
| 49997 | 149996 | -999.0 | 125.733 | 58.863 | 174.734 | 0.828 | 171.433 | -0.139 | 1.216 | 34.733 | 305.929 | 2.234 | 0.007 | 0.062 | 30.498 | -0.34 | 1.375 | 68.136 | 0.869 | 1.245 | 134.87 | -0.186 | 409.819 | 2 | 149.11 | -0.234 | -2.235 | 58.184 | 0.594 | 2.188 | 207.294 | 0.720062 | b | t | 2.36156 |
| 49998 | 149997 | 128.498 | 18.588 | 69.903 | 54.601 | 3.932 | 666.91 | -3.56 | 3.025 | 2.339 | 261.035 | 0.844 | -1.398 | 0.559 | 38.094 | -0.936 | -3.106 | 32.158 | -0.948 | 0.152 | 61.561 | -0.269 | 348.625 | 2 | 122.3 | -1.414 | 3.043 | 68.483 | 2.518 | 0.046 | 190.783 | 0.000461282 | s | t | 0.0015027 |
| 49999 | 149998 | 151.113 | 70.106 | 93.991 | 4.145 | -999.0 | -999.0 | -999.0 | 3.4 | 4.145 | 79.236 | 1.711 | 1.405 | -999.0 | 29.225 | 0.746 | 0.521 | 50.01 | 2.074 | -2.609 | 24.589 | 0.476 | 116.539 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 0.00572068 | s | t | 0.0186361 |
| 50000 | 149999 | 104.21 | 18.268 | 58.438 | 80.275 | -999.0 | -999.0 | -999.0 | 2.135 | 31.245 | 106.091 | 0.851 | 1.368 | -999.0 | 30.645 | 0.859 | 2.111 | 26.094 | -0.484 | 0.451 | 43.201 | 1.002 | 161.614 | 1 | 49.353 | -2.641 | -2.038 | -999.0 | -999.0 | -999.0 | 49.353 | 0.000461282 | s | t | 0.0015027 |
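The `-999.0` entries in the table above act as a sentinel for missing values, and the rest of this solution replaces them with `missing` after loading. As a possible shortcut, `CSV.read` also accepts a `missingstring` keyword that converts a given text token to `missing` while parsing. The sketch below runs on a small made-up CSV snippet (`csv_text` and `toy` are hypothetical names, not part of the tutorial), and it assumes the sentinel is written literally as `-999.0` in the file, which has not been verified for the real data set.

```julia
using CSV
using DataFrames

# Hypothetical three-column excerpt, not the real ATLAS file
csv_text = """
EventId,DER_mass_MMC,Label
100000,138.47,s
100002,-999.0,b
"""

# Treat the literal text "-999.0" as missing while parsing
toy = CSV.read(IOBuffer(csv_text), DataFrame; missingstring = "-999.0")

# The affected column now allows missing values
println(eltype(toy.DER_mass_MMC))
println(count(ismissing, toy.DER_mass_MMC))
```

This only matches cells whose text is exactly the given string, so a file that writes the sentinel as `-999` would need that token instead. The function-based approach used in this solution compares numeric values and does not depend on how the number was formatted.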
# Allow missing values in every column (required before `missing` can be assigned in place)
allowmissing!(higgs_ml)
50000×35 DataFrame
49975 rows omitted
| Row | EventId | DER_mass_MMC | DER_mass_transverse_met_lep | DER_mass_vis | DER_pt_h | DER_deltaeta_jet_jet | DER_mass_jet_jet | DER_prodeta_jet_jet | DER_deltar_tau_lep | DER_pt_tot | DER_sum_pt | DER_pt_ratio_lep_tau | DER_met_phi_centrality | DER_lep_eta_centrality | PRI_tau_pt | PRI_tau_eta | PRI_tau_phi | PRI_lep_pt | PRI_lep_eta | PRI_lep_phi | PRI_met | PRI_met_phi | PRI_met_sumet | PRI_jet_num | PRI_jet_leading_pt | PRI_jet_leading_eta | PRI_jet_leading_phi | PRI_jet_subleading_pt | PRI_jet_subleading_eta | PRI_jet_subleading_phi | PRI_jet_all_pt | Weight | Label | KaggleSet | KaggleWeight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Int64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Int64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | String1? | String1? | Float64? | |
| 1 | 100000 | 138.47 | 51.655 | 97.827 | 27.98 | 0.91 | 124.711 | 2.666 | 3.064 | 41.928 | 197.76 | 1.582 | 1.396 | 0.2 | 32.638 | 1.017 | 0.381 | 51.626 | 2.273 | -2.414 | 16.824 | -0.277 | 258.733 | 2 | 67.435 | 2.15 | 0.444 | 46.062 | 1.24 | -2.475 | 113.497 | 0.00081448 | s | t | 0.00265331 |
| 2 | 100001 | 160.937 | 68.768 | 103.235 | 48.146 | -999.0 | -999.0 | -999.0 | 3.473 | 2.078 | 125.157 | 0.879 | 1.414 | -999.0 | 42.014 | 2.039 | -3.011 | 36.918 | 0.501 | 0.103 | 44.704 | -1.916 | 164.546 | 1 | 46.226 | 0.725 | 1.158 | -999.0 | -999.0 | -999.0 | 46.226 | 0.681042 | b | t | 2.23358 |
| 3 | 100002 | -999.0 | 162.172 | 125.953 | 35.635 | -999.0 | -999.0 | -999.0 | 3.148 | 9.336 | 197.814 | 3.776 | 1.414 | -999.0 | 32.154 | -0.705 | -2.093 | 121.409 | -0.953 | 1.052 | 54.283 | -2.186 | 260.414 | 1 | 44.251 | 2.053 | -2.028 | -999.0 | -999.0 | -999.0 | 44.251 | 0.715742 | b | t | 2.34739 |
| 4 | 100003 | 143.905 | 81.417 | 80.943 | 0.414 | -999.0 | -999.0 | -999.0 | 3.31 | 0.414 | 75.968 | 2.354 | -1.285 | -999.0 | 22.647 | -1.655 | 0.01 | 53.321 | -0.522 | -3.1 | 31.082 | 0.06 | 86.062 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -0.0 | 1.66065 | b | t | 5.44638 |
| 5 | 100004 | 175.864 | 16.915 | 134.805 | 16.405 | -999.0 | -999.0 | -999.0 | 3.891 | 16.405 | 57.983 | 1.056 | -1.385 | -999.0 | 28.209 | -2.197 | -2.231 | 29.774 | 0.798 | 1.569 | 2.723 | -0.871 | 53.131 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.90426 | b | t | 6.24533 |
| 6 | 100005 | 89.744 | 13.55 | 59.149 | 116.344 | 2.636 | 284.584 | -0.54 | 1.362 | 61.619 | 278.876 | 0.588 | 0.479 | 0.975 | 53.651 | 0.371 | 1.329 | 31.565 | -0.884 | 1.857 | 40.735 | 2.237 | 282.849 | 3 | 90.547 | -2.412 | -0.653 | 56.165 | 0.224 | 3.106 | 193.66 | 0.0254338 | b | t | 0.083414 |
| 7 | 100006 | 148.754 | 28.862 | 107.782 | 106.13 | 0.733 | 158.359 | 0.113 | 2.941 | 2.545 | 305.967 | 3.371 | 1.393 | 0.791 | 28.85 | 1.113 | 2.409 | 97.24 | 0.675 | -0.966 | 38.421 | -1.443 | 294.074 | 2 | 123.01 | 0.864 | 1.45 | 56.867 | 0.131 | -2.767 | 179.877 | 0.00081448 | s | t | 0.00265331 |
| 8 | 100007 | 154.916 | 10.418 | 94.714 | 29.169 | -999.0 | -999.0 | -999.0 | 2.897 | 1.526 | 138.178 | 0.365 | -1.305 | -999.0 | 78.8 | 0.654 | 1.547 | 28.74 | 0.506 | -1.347 | 22.275 | -1.761 | 187.299 | 1 | 30.638 | -0.715 | -1.724 | -999.0 | -999.0 | -999.0 | 30.638 | 0.00572068 | s | t | 0.0186361 |
| 9 | 100008 | 105.594 | 50.559 | 100.989 | 4.288 | -999.0 | -999.0 | -999.0 | 2.904 | 4.288 | 65.333 | 0.675 | -1.366 | -999.0 | 39.008 | 2.433 | -2.532 | 26.325 | 0.21 | 1.884 | 37.791 | 0.024 | 129.804 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.6148 | b | t | 5.296 |
| 10 | 100009 | 128.053 | 88.941 | 69.272 | 193.392 | -999.0 | -999.0 | -999.0 | 1.609 | 28.859 | 255.123 | 0.599 | 0.538 | -999.0 | 54.646 | -1.533 | 0.416 | 32.742 | -0.317 | -0.636 | 132.678 | 0.845 | 294.741 | 1 | 167.735 | -2.767 | -2.514 | -999.0 | -999.0 | -999.0 | 167.735 | 0.000461025 | s | t | 0.00150187 |
| 11 | 100010 | -999.0 | 86.24 | 79.692 | 27.201 | -999.0 | -999.0 | -999.0 | 2.338 | 27.201 | 81.734 | 1.75 | -1.412 | -999.0 | 29.718 | -0.866 | 2.878 | 52.016 | 0.126 | -1.288 | 51.276 | 0.688 | 250.178 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 0.701141 | b | t | 2.2995 |
| 12 | 100011 | 114.744 | 10.286 | 75.712 | 30.816 | 2.563 | 252.599 | -1.401 | 2.888 | 36.745 | 239.804 | 1.061 | 1.364 | 0.769 | 35.976 | -0.669 | -0.342 | 38.188 | -0.165 | 2.502 | 22.385 | 2.148 | 290.547 | 3 | 76.773 | -0.79 | 0.303 | 56.876 | 1.773 | -2.079 | 165.64 | 0.093659 | b | t | 0.30717 |
| 13 | 100012 | 145.297 | 64.234 | 103.565 | 106.999 | -999.0 | -999.0 | -999.0 | 2.183 | 24.66 | 192.245 | 0.576 | 0.689 | -999.0 | 62.89 | -0.766 | -1.632 | 36.237 | 0.722 | -0.035 | 43.91 | -1.907 | 232.362 | 1 | 93.117 | -0.97 | 1.943 | -999.0 | -999.0 | -999.0 | 93.117 | 0.51274 | b | t | 1.68161 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 49989 | 149988 | 91.594 | 70.574 | 57.615 | 34.452 | -999.0 | -999.0 | -999.0 | 2.843 | 3.009 | 89.138 | 1.227 | -1.401 | -999.0 | 25.443 | 1.4 | -1.228 | 31.219 | 1.939 | 2.264 | 45.691 | -0.149 | 111.575 | 1 | 32.476 | -1.057 | 2.877 | -999.0 | -999.0 | -999.0 | 32.476 | 0.617144 | b | t | 2.02402 |
| 49990 | 149989 | -999.0 | 68.957 | 76.191 | 2.304 | -999.0 | -999.0 | -999.0 | 2.32 | 2.304 | 74.611 | 1.072 | -1.414 | -999.0 | 36.01 | -1.651 | -0.452 | 38.601 | -0.594 | -2.517 | 40.181 | 1.633 | 96.391 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.43486 | b | t | 4.70584 |
| 49991 | 149990 | 118.34 | 51.063 | 51.062 | 116.277 | 1.31 | 238.766 | 4.362 | 1.785 | 2.564 | 292.117 | 2.008 | 1.414 | 0.0 | 23.089 | 0.3 | 2.12 | 46.356 | 0.158 | -2.384 | 71.478 | 2.98 | 338.491 | 2 | 166.372 | -2.844 | -0.148 | 56.3 | -1.534 | 2.692 | 222.672 | 0.00572068 | s | t | 0.0186361 |
| 49992 | 149991 | 131.127 | 58.566 | 85.136 | 151.547 | -999.0 | -999.0 | -999.0 | 1.455 | 11.819 | 268.611 | 0.989 | 1.342 | -999.0 | 64.296 | -1.95 | -2.078 | 63.568 | -2.053 | -0.626 | 57.335 | -1.639 | 355.655 | 1 | 140.746 | 0.593 | 1.712 | -999.0 | -999.0 | -999.0 | 140.746 | 0.000461282 | s | t | 0.0015027 |
| 49993 | 149992 | -999.0 | 95.864 | 90.317 | 21.262 | -999.0 | -999.0 | -999.0 | 2.907 | 21.262 | 64.717 | 2.167 | -1.096 | -999.0 | 20.432 | 0.5 | 2.282 | 44.285 | -1.579 | 0.251 | 51.998 | -2.795 | 184.146 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 0.832855 | b | t | 2.73148 |
| 49994 | 149993 | 99.452 | 81.13 | 73.539 | 70.79 | -999.0 | -999.0 | -999.0 | 2.142 | 41.225 | 159.922 | 2.991 | 1.194 | -999.0 | 24.108 | 1.616 | -2.424 | 72.108 | 1.407 | -0.292 | 34.003 | -2.212 | 216.834 | 1 | 63.705 | 2.884 | 2.632 | -999.0 | -999.0 | -999.0 | 63.705 | 0.000461282 | s | t | 0.0015027 |
| 49995 | 149994 | 59.653 | 67.648 | 50.783 | 24.749 | -999.0 | -999.0 | -999.0 | 1.621 | 24.749 | 72.522 | 1.717 | -1.414 | -999.0 | 26.691 | -1.067 | -2.872 | 45.83 | -0.939 | -1.256 | 29.411 | 1.087 | 177.077 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 1.41038 | b | t | 4.62555 |
| 49996 | 149995 | 133.013 | 29.198 | 78.723 | 110.576 | 0.666 | 281.596 | -0.073 | 1.96 | 56.406 | 501.854 | 0.418 | 1.398 | 0.0 | 73.174 | 0.819 | 1.32 | 30.581 | 0.952 | -0.636 | 51.207 | 0.12 | 553.857 | 3 | 193.297 | 0.14 | -1.798 | 92.691 | -0.526 | 1.653 | 398.099 | 0.0254338 | b | t | 0.083414 |
| 49997 | 149996 | -999.0 | 125.733 | 58.863 | 174.734 | 0.828 | 171.433 | -0.139 | 1.216 | 34.733 | 305.929 | 2.234 | 0.007 | 0.062 | 30.498 | -0.34 | 1.375 | 68.136 | 0.869 | 1.245 | 134.87 | -0.186 | 409.819 | 2 | 149.11 | -0.234 | -2.235 | 58.184 | 0.594 | 2.188 | 207.294 | 0.720062 | b | t | 2.36156 |
| 49998 | 149997 | 128.498 | 18.588 | 69.903 | 54.601 | 3.932 | 666.91 | -3.56 | 3.025 | 2.339 | 261.035 | 0.844 | -1.398 | 0.559 | 38.094 | -0.936 | -3.106 | 32.158 | -0.948 | 0.152 | 61.561 | -0.269 | 348.625 | 2 | 122.3 | -1.414 | 3.043 | 68.483 | 2.518 | 0.046 | 190.783 | 0.000461282 | s | t | 0.0015027 |
| 49999 | 149998 | 151.113 | 70.106 | 93.991 | 4.145 | -999.0 | -999.0 | -999.0 | 3.4 | 4.145 | 79.236 | 1.711 | 1.405 | -999.0 | 29.225 | 0.746 | 0.521 | 50.01 | 2.074 | -2.609 | 24.589 | 0.476 | 116.539 | 0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | -999.0 | 0.0 | 0.00572068 | s | t | 0.0186361 |
| 50000 | 149999 | 104.21 | 18.268 | 58.438 | 80.275 | -999.0 | -999.0 | -999.0 | 2.135 | 31.245 | 106.091 | 0.851 | 1.368 | -999.0 | 30.645 | 0.859 | 2.111 | 26.094 | -0.484 | 0.451 | 43.201 | 1.002 | 161.614 | 1 | 49.353 | -2.641 | -2.038 | -999.0 | -999.0 | -999.0 | 49.353 | 0.000461282 | s | t | 0.0015027 |
# We define two methods of the function: one for numbers (where comparing with -999.0 is meaningful)
# and a fallback for all other values (strings, `missing`), which returns them unchanged
missing_value(v::Number) = v == -999.0 ? missing : v
missing_value(v) = v
missing_value (generic function with 2 methods)
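To see how the two methods are selected, it can help to call the function on a few individual values. The snippet below is a minimal sketch in plain Julia; it repeats the definition in an equivalent form so that it runs on its own, and the `sample` vector is made up for illustration.

```julia
# Same logic as the definition in this solution, repeated so the snippet is self-contained
missing_value(v::Number) = v == -999.0 ? missing : v
missing_value(v) = v

# Numbers go through the first method
println(missing_value(138.47))   # unchanged
println(missing_value(-999.0))   # replaced by missing

# Strings and existing `missing` values hit the fallback and pass through
println(missing_value("s"))
println(missing_value(missing))

# Broadcasting over a vector widens the element type to allow missing
sample = [138.47, -999.0, 0.91]
cleaned = missing_value.(sample)
println(eltype(cleaned))
```

Because `Missing` is not a subtype of `Number`, a value that is already `missing` never reaches the numeric method, so no explicit check for it is needed there.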
# `select!` places each transformed column first, so process the columns
# right to left (hence the reverse) to end up with the original column order
for column in reverse!(names(higgs_ml))
    select!(higgs_ml, column => ByRow(missing_value) => column, :)
end
higgs_ml
50000×35 DataFrame
49975 rows omitted
| Row | EventId | DER_mass_MMC | DER_mass_transverse_met_lep | DER_mass_vis | DER_pt_h | DER_deltaeta_jet_jet | DER_mass_jet_jet | DER_prodeta_jet_jet | DER_deltar_tau_lep | DER_pt_tot | DER_sum_pt | DER_pt_ratio_lep_tau | DER_met_phi_centrality | DER_lep_eta_centrality | PRI_tau_pt | PRI_tau_eta | PRI_tau_phi | PRI_lep_pt | PRI_lep_eta | PRI_lep_phi | PRI_met | PRI_met_phi | PRI_met_sumet | PRI_jet_num | PRI_jet_leading_pt | PRI_jet_leading_eta | PRI_jet_leading_phi | PRI_jet_subleading_pt | PRI_jet_subleading_eta | PRI_jet_subleading_phi | PRI_jet_all_pt | Weight | Label | KaggleSet | KaggleWeight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Float64? | Float64 | Float64 | Float64 | Float64? | Float64? | Float64? | Float64 | Float64 | Float64 | Float64 | Float64 | Float64? | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64 | String1 | String1 | Float64 |
| 1 | 100000 | 138.47 | 51.655 | 97.827 | 27.98 | 0.91 | 124.711 | 2.666 | 3.064 | 41.928 | 197.76 | 1.582 | 1.396 | 0.2 | 32.638 | 1.017 | 0.381 | 51.626 | 2.273 | -2.414 | 16.824 | -0.277 | 258.733 | 2 | 67.435 | 2.15 | 0.444 | 46.062 | 1.24 | -2.475 | 113.497 | 0.00081448 | s | t | 0.00265331 |
| 2 | 100001 | 160.937 | 68.768 | 103.235 | 48.146 | missing | missing | missing | 3.473 | 2.078 | 125.157 | 0.879 | 1.414 | missing | 42.014 | 2.039 | -3.011 | 36.918 | 0.501 | 0.103 | 44.704 | -1.916 | 164.546 | 1 | 46.226 | 0.725 | 1.158 | missing | missing | missing | 46.226 | 0.681042 | b | t | 2.23358 |
| 3 | 100002 | missing | 162.172 | 125.953 | 35.635 | missing | missing | missing | 3.148 | 9.336 | 197.814 | 3.776 | 1.414 | missing | 32.154 | -0.705 | -2.093 | 121.409 | -0.953 | 1.052 | 54.283 | -2.186 | 260.414 | 1 | 44.251 | 2.053 | -2.028 | missing | missing | missing | 44.251 | 0.715742 | b | t | 2.34739 |
| 4 | 100003 | 143.905 | 81.417 | 80.943 | 0.414 | missing | missing | missing | 3.31 | 0.414 | 75.968 | 2.354 | -1.285 | missing | 22.647 | -1.655 | 0.01 | 53.321 | -0.522 | -3.1 | 31.082 | 0.06 | 86.062 | 0 | missing | missing | missing | missing | missing | missing | -0.0 | 1.66065 | b | t | 5.44638 |
| 5 | 100004 | 175.864 | 16.915 | 134.805 | 16.405 | missing | missing | missing | 3.891 | 16.405 | 57.983 | 1.056 | -1.385 | missing | 28.209 | -2.197 | -2.231 | 29.774 | 0.798 | 1.569 | 2.723 | -0.871 | 53.131 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.90426 | b | t | 6.24533 |
| 6 | 100005 | 89.744 | 13.55 | 59.149 | 116.344 | 2.636 | 284.584 | -0.54 | 1.362 | 61.619 | 278.876 | 0.588 | 0.479 | 0.975 | 53.651 | 0.371 | 1.329 | 31.565 | -0.884 | 1.857 | 40.735 | 2.237 | 282.849 | 3 | 90.547 | -2.412 | -0.653 | 56.165 | 0.224 | 3.106 | 193.66 | 0.0254338 | b | t | 0.083414 |
| 7 | 100006 | 148.754 | 28.862 | 107.782 | 106.13 | 0.733 | 158.359 | 0.113 | 2.941 | 2.545 | 305.967 | 3.371 | 1.393 | 0.791 | 28.85 | 1.113 | 2.409 | 97.24 | 0.675 | -0.966 | 38.421 | -1.443 | 294.074 | 2 | 123.01 | 0.864 | 1.45 | 56.867 | 0.131 | -2.767 | 179.877 | 0.00081448 | s | t | 0.00265331 |
| 8 | 100007 | 154.916 | 10.418 | 94.714 | 29.169 | missing | missing | missing | 2.897 | 1.526 | 138.178 | 0.365 | -1.305 | missing | 78.8 | 0.654 | 1.547 | 28.74 | 0.506 | -1.347 | 22.275 | -1.761 | 187.299 | 1 | 30.638 | -0.715 | -1.724 | missing | missing | missing | 30.638 | 0.00572068 | s | t | 0.0186361 |
| 9 | 100008 | 105.594 | 50.559 | 100.989 | 4.288 | missing | missing | missing | 2.904 | 4.288 | 65.333 | 0.675 | -1.366 | missing | 39.008 | 2.433 | -2.532 | 26.325 | 0.21 | 1.884 | 37.791 | 0.024 | 129.804 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.6148 | b | t | 5.296 |
| 10 | 100009 | 128.053 | 88.941 | 69.272 | 193.392 | missing | missing | missing | 1.609 | 28.859 | 255.123 | 0.599 | 0.538 | missing | 54.646 | -1.533 | 0.416 | 32.742 | -0.317 | -0.636 | 132.678 | 0.845 | 294.741 | 1 | 167.735 | -2.767 | -2.514 | missing | missing | missing | 167.735 | 0.000461025 | s | t | 0.00150187 |
| 11 | 100010 | missing | 86.24 | 79.692 | 27.201 | missing | missing | missing | 2.338 | 27.201 | 81.734 | 1.75 | -1.412 | missing | 29.718 | -0.866 | 2.878 | 52.016 | 0.126 | -1.288 | 51.276 | 0.688 | 250.178 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.701141 | b | t | 2.2995 |
| 12 | 100011 | 114.744 | 10.286 | 75.712 | 30.816 | 2.563 | 252.599 | -1.401 | 2.888 | 36.745 | 239.804 | 1.061 | 1.364 | 0.769 | 35.976 | -0.669 | -0.342 | 38.188 | -0.165 | 2.502 | 22.385 | 2.148 | 290.547 | 3 | 76.773 | -0.79 | 0.303 | 56.876 | 1.773 | -2.079 | 165.64 | 0.093659 | b | t | 0.30717 |
| 13 | 100012 | 145.297 | 64.234 | 103.565 | 106.999 | missing | missing | missing | 2.183 | 24.66 | 192.245 | 0.576 | 0.689 | missing | 62.89 | -0.766 | -1.632 | 36.237 | 0.722 | -0.035 | 43.91 | -1.907 | 232.362 | 1 | 93.117 | -0.97 | 1.943 | missing | missing | missing | 93.117 | 0.51274 | b | t | 1.68161 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 49989 | 149988 | 91.594 | 70.574 | 57.615 | 34.452 | missing | missing | missing | 2.843 | 3.009 | 89.138 | 1.227 | -1.401 | missing | 25.443 | 1.4 | -1.228 | 31.219 | 1.939 | 2.264 | 45.691 | -0.149 | 111.575 | 1 | 32.476 | -1.057 | 2.877 | missing | missing | missing | 32.476 | 0.617144 | b | t | 2.02402 |
| 49990 | 149989 | missing | 68.957 | 76.191 | 2.304 | missing | missing | missing | 2.32 | 2.304 | 74.611 | 1.072 | -1.414 | missing | 36.01 | -1.651 | -0.452 | 38.601 | -0.594 | -2.517 | 40.181 | 1.633 | 96.391 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.43486 | b | t | 4.70584 |
| 49991 | 149990 | 118.34 | 51.063 | 51.062 | 116.277 | 1.31 | 238.766 | 4.362 | 1.785 | 2.564 | 292.117 | 2.008 | 1.414 | 0.0 | 23.089 | 0.3 | 2.12 | 46.356 | 0.158 | -2.384 | 71.478 | 2.98 | 338.491 | 2 | 166.372 | -2.844 | -0.148 | 56.3 | -1.534 | 2.692 | 222.672 | 0.00572068 | s | t | 0.0186361 |
| 49992 | 149991 | 131.127 | 58.566 | 85.136 | 151.547 | missing | missing | missing | 1.455 | 11.819 | 268.611 | 0.989 | 1.342 | missing | 64.296 | -1.95 | -2.078 | 63.568 | -2.053 | -0.626 | 57.335 | -1.639 | 355.655 | 1 | 140.746 | 0.593 | 1.712 | missing | missing | missing | 140.746 | 0.000461282 | s | t | 0.0015027 |
| 49993 | 149992 | missing | 95.864 | 90.317 | 21.262 | missing | missing | missing | 2.907 | 21.262 | 64.717 | 2.167 | -1.096 | missing | 20.432 | 0.5 | 2.282 | 44.285 | -1.579 | 0.251 | 51.998 | -2.795 | 184.146 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.832855 | b | t | 2.73148 |
| 49994 | 149993 | 99.452 | 81.13 | 73.539 | 70.79 | missing | missing | missing | 2.142 | 41.225 | 159.922 | 2.991 | 1.194 | missing | 24.108 | 1.616 | -2.424 | 72.108 | 1.407 | -0.292 | 34.003 | -2.212 | 216.834 | 1 | 63.705 | 2.884 | 2.632 | missing | missing | missing | 63.705 | 0.000461282 | s | t | 0.0015027 |
| 49995 | 149994 | 59.653 | 67.648 | 50.783 | 24.749 | missing | missing | missing | 1.621 | 24.749 | 72.522 | 1.717 | -1.414 | missing | 26.691 | -1.067 | -2.872 | 45.83 | -0.939 | -1.256 | 29.411 | 1.087 | 177.077 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.41038 | b | t | 4.62555 |
| 49996 | 149995 | 133.013 | 29.198 | 78.723 | 110.576 | 0.666 | 281.596 | -0.073 | 1.96 | 56.406 | 501.854 | 0.418 | 1.398 | 0.0 | 73.174 | 0.819 | 1.32 | 30.581 | 0.952 | -0.636 | 51.207 | 0.12 | 553.857 | 3 | 193.297 | 0.14 | -1.798 | 92.691 | -0.526 | 1.653 | 398.099 | 0.0254338 | b | t | 0.083414 |
| 49997 | 149996 | missing | 125.733 | 58.863 | 174.734 | 0.828 | 171.433 | -0.139 | 1.216 | 34.733 | 305.929 | 2.234 | 0.007 | 0.062 | 30.498 | -0.34 | 1.375 | 68.136 | 0.869 | 1.245 | 134.87 | -0.186 | 409.819 | 2 | 149.11 | -0.234 | -2.235 | 58.184 | 0.594 | 2.188 | 207.294 | 0.720062 | b | t | 2.36156 |
| 49998 | 149997 | 128.498 | 18.588 | 69.903 | 54.601 | 3.932 | 666.91 | -3.56 | 3.025 | 2.339 | 261.035 | 0.844 | -1.398 | 0.559 | 38.094 | -0.936 | -3.106 | 32.158 | -0.948 | 0.152 | 61.561 | -0.269 | 348.625 | 2 | 122.3 | -1.414 | 3.043 | 68.483 | 2.518 | 0.046 | 190.783 | 0.000461282 | s | t | 0.0015027 |
| 49999 | 149998 | 151.113 | 70.106 | 93.991 | 4.145 | missing | missing | missing | 3.4 | 4.145 | 79.236 | 1.711 | 1.405 | missing | 29.225 | 0.746 | 0.521 | 50.01 | 2.074 | -2.609 | 24.589 | 0.476 | 116.539 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.00572068 | s | t | 0.0186361 |
| 50000 | 149999 | 104.21 | 18.268 | 58.438 | 80.275 | missing | missing | missing | 2.135 | 31.245 | 106.091 | 0.851 | 1.368 | missing | 30.645 | 0.859 | 2.111 | 26.094 | -0.484 | 0.451 | 43.201 | 1.002 | 161.614 | 1 | 49.353 | -2.641 | -2.038 | missing | missing | missing | 49.353 | 0.000461282 | s | t | 0.0015027 |
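The loop above relies on column reordering to keep the layout intact. One possible alternative is `transform!`, which replaces columns where they are, combined with `renamecols = false` so the results keep the source column names. The sketch below applies it to a small made-up frame (`toy` is a hypothetical stand-in for `higgs_ml`) and repeats the `missing_value` definition so it runs on its own.

```julia
using DataFrames

# Same logic as in this solution, repeated so the snippet is self-contained
missing_value(v::Number) = v == -999.0 ? missing : v
missing_value(v) = v

# Hypothetical miniature of the data set
toy = DataFrame(
    EventId = [100000, 100002, 100005],
    DER_mass_MMC = [138.47, -999.0, 89.744],
    Label = ["s", "b", "b"],
)

# Apply the function row by row to every column; renamecols = false keeps
# the original names, and transform! leaves the column order untouched
transform!(toy, names(toy) .=> ByRow(missing_value); renamecols = false)

println(names(toy))
println(count(ismissing, toy.DER_mass_MMC))
```

As with the loop, each column is rebuilt from the function's return values, so only columns that actually contain a sentinel end up with an element type that allows `missing`.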
using Plots
using StatsPlots
signal = filter(:Label => l -> l == "s", higgs_ml)
background = filter(:Label => l -> l == "b", higgs_ml)
32935×35 DataFrame
32910 rows omitted
| Row | EventId | DER_mass_MMC | DER_mass_transverse_met_lep | DER_mass_vis | DER_pt_h | DER_deltaeta_jet_jet | DER_mass_jet_jet | DER_prodeta_jet_jet | DER_deltar_tau_lep | DER_pt_tot | DER_sum_pt | DER_pt_ratio_lep_tau | DER_met_phi_centrality | DER_lep_eta_centrality | PRI_tau_pt | PRI_tau_eta | PRI_tau_phi | PRI_lep_pt | PRI_lep_eta | PRI_lep_phi | PRI_met | PRI_met_phi | PRI_met_sumet | PRI_jet_num | PRI_jet_leading_pt | PRI_jet_leading_eta | PRI_jet_leading_phi | PRI_jet_subleading_pt | PRI_jet_subleading_eta | PRI_jet_subleading_phi | PRI_jet_all_pt | Weight | Label | KaggleSet | KaggleWeight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Float64? | Float64 | Float64 | Float64 | Float64? | Float64? | Float64? | Float64 | Float64 | Float64 | Float64 | Float64 | Float64? | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64 | String1 | String1 | Float64 |
| 1 | 100001 | 160.937 | 68.768 | 103.235 | 48.146 | missing | missing | missing | 3.473 | 2.078 | 125.157 | 0.879 | 1.414 | missing | 42.014 | 2.039 | -3.011 | 36.918 | 0.501 | 0.103 | 44.704 | -1.916 | 164.546 | 1 | 46.226 | 0.725 | 1.158 | missing | missing | missing | 46.226 | 0.681042 | b | t | 2.23358 |
| 2 | 100002 | missing | 162.172 | 125.953 | 35.635 | missing | missing | missing | 3.148 | 9.336 | 197.814 | 3.776 | 1.414 | missing | 32.154 | -0.705 | -2.093 | 121.409 | -0.953 | 1.052 | 54.283 | -2.186 | 260.414 | 1 | 44.251 | 2.053 | -2.028 | missing | missing | missing | 44.251 | 0.715742 | b | t | 2.34739 |
| 3 | 100003 | 143.905 | 81.417 | 80.943 | 0.414 | missing | missing | missing | 3.31 | 0.414 | 75.968 | 2.354 | -1.285 | missing | 22.647 | -1.655 | 0.01 | 53.321 | -0.522 | -3.1 | 31.082 | 0.06 | 86.062 | 0 | missing | missing | missing | missing | missing | missing | -0.0 | 1.66065 | b | t | 5.44638 |
| 4 | 100004 | 175.864 | 16.915 | 134.805 | 16.405 | missing | missing | missing | 3.891 | 16.405 | 57.983 | 1.056 | -1.385 | missing | 28.209 | -2.197 | -2.231 | 29.774 | 0.798 | 1.569 | 2.723 | -0.871 | 53.131 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.90426 | b | t | 6.24533 |
| 5 | 100005 | 89.744 | 13.55 | 59.149 | 116.344 | 2.636 | 284.584 | -0.54 | 1.362 | 61.619 | 278.876 | 0.588 | 0.479 | 0.975 | 53.651 | 0.371 | 1.329 | 31.565 | -0.884 | 1.857 | 40.735 | 2.237 | 282.849 | 3 | 90.547 | -2.412 | -0.653 | 56.165 | 0.224 | 3.106 | 193.66 | 0.0254338 | b | t | 0.083414 |
| 6 | 100008 | 105.594 | 50.559 | 100.989 | 4.288 | missing | missing | missing | 2.904 | 4.288 | 65.333 | 0.675 | -1.366 | missing | 39.008 | 2.433 | -2.532 | 26.325 | 0.21 | 1.884 | 37.791 | 0.024 | 129.804 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.6148 | b | t | 5.296 |
| 7 | 100010 | missing | 86.24 | 79.692 | 27.201 | missing | missing | missing | 2.338 | 27.201 | 81.734 | 1.75 | -1.412 | missing | 29.718 | -0.866 | 2.878 | 52.016 | 0.126 | -1.288 | 51.276 | 0.688 | 250.178 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.701141 | b | t | 2.2995 |
| 8 | 100011 | 114.744 | 10.286 | 75.712 | 30.816 | 2.563 | 252.599 | -1.401 | 2.888 | 36.745 | 239.804 | 1.061 | 1.364 | 0.769 | 35.976 | -0.669 | -0.342 | 38.188 | -0.165 | 2.502 | 22.385 | 2.148 | 290.547 | 3 | 76.773 | -0.79 | 0.303 | 56.876 | 1.773 | -2.079 | 165.64 | 0.093659 | b | t | 0.30717 |
| 9 | 100012 | 145.297 | 64.234 | 103.565 | 106.999 | missing | missing | missing | 2.183 | 24.66 | 192.245 | 0.576 | 0.689 | missing | 62.89 | -0.766 | -1.632 | 36.237 | 0.722 | -0.035 | 43.91 | -1.907 | 232.362 | 1 | 93.117 | -0.97 | 1.943 | missing | missing | missing | 93.117 | 0.51274 | b | t | 1.68161 |
| 10 | 100013 | 82.488 | 31.663 | 64.128 | 8.232 | missing | missing | missing | 2.823 | 8.232 | 58.649 | 1.303 | -1.414 | missing | 25.47 | -0.654 | -2.99 | 33.179 | -1.665 | -0.354 | 12.439 | 1.433 | 163.42 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.66589 | b | t | 2.18389 |
| 11 | 100014 | missing | 109.412 | 14.398 | 17.323 | missing | missing | missing | 0.472 | 17.323 | 62.565 | 1.774 | -0.272 | missing | 22.552 | 1.389 | 1.34 | 40.013 | 1.856 | 1.412 | 75.197 | -1.583 | 198.616 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.655922 | b | t | 2.1512 |
| 12 | 100016 | 114.256 | 4.351 | 67.963 | 47.221 | missing | missing | missing | 2.954 | 26.243 | 100.93 | 1.145 | 0.218 | missing | 30.145 | 0.484 | -0.929 | 34.522 | -0.215 | 1.941 | 41.899 | 2.055 | 191.568 | 1 | 36.263 | -0.766 | -0.686 | missing | missing | missing | 36.263 | 0.443598 | b | t | 1.45485 |
| 13 | 100018 | missing | 85.186 | 68.827 | 5.042 | missing | missing | missing | 2.116 | 5.042 | 71.443 | 1.558 | -1.351 | missing | 27.931 | 1.175 | 2.356 | 43.512 | 2.332 | 0.584 | 44.698 | -2.033 | 151.816 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.56163 | b | t | 5.12162 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 32924 | 149980 | missing | 87.304 | 89.604 | 14.486 | missing | missing | missing | 2.252 | 14.486 | 70.913 | 0.597 | -1.411 | missing | 44.39 | 0.683 | -0.507 | 26.523 | -1.394 | 0.364 | 74.833 | 3.103 | 149.571 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.55172 | b | t | 5.08911 |
| 32925 | 149982 | missing | 93.32 | 35.518 | 9.338 | missing | missing | missing | 1.125 | 9.338 | 66.754 | 1.829 | -1.341 | missing | 23.599 | 1.709 | 0.901 | 43.155 | 2.392 | 0.007 | 51.511 | -2.847 | 144.558 | 0 | missing | missing | missing | missing | missing | missing | -0.0 | 0.643797 | b | t | 2.11143 |
| 32926 | 149983 | 225.397 | 83.38 | 134.68 | 70.387 | 0.114 | 62.485 | 3.535 | 3.406 | 38.621 | 237.189 | 0.648 | -0.366 | 0.0 | 53.694 | 0.209 | -1.27 | 34.799 | -1.839 | 1.451 | 50.239 | -1.537 | 253.344 | 3 | 64.015 | 1.938 | 1.029 | 47.673 | 1.824 | 2.178 | 148.696 | 0.334829 | b | t | 1.09813 |
| 32927 | 149984 | 93.899 | 18.141 | 66.71 | 39.914 | missing | missing | missing | 2.932 | 1.387 | 109.369 | 1.912 | -1.388 | missing | 24.206 | 2.3 | 1.188 | 46.27 | 2.179 | -2.166 | 20.348 | -1.565 | 84.865 | 1 | 38.893 | 1.232 | 1.161 | missing | missing | missing | 38.893 | 0.51274 | b | t | 1.68161 |
| 32928 | 149985 | 94.095 | 69.654 | 63.508 | 374.919 | 0.649 | 292.152 | 1.067 | 0.498 | 75.552 | 768.408 | 1.56 | 0.946 | 0.0 | 102.883 | -0.481 | 0.744 | 160.477 | -0.334 | 1.22 | 122.771 | 0.718 | 799.338 | 3 | 320.452 | 0.758 | -2.373 | 143.898 | 1.407 | -1.119 | 505.049 | 0.0254338 | b | t | 0.083414 |
| 32929 | 149987 | 68.068 | 28.454 | 39.672 | 209.323 | 1.114 | 165.893 | 0.142 | 0.839 | 1.906 | 353.411 | 3.594 | 1.404 | 0.053 | 25.186 | -2.105 | 0.395 | 90.511 | -1.628 | -0.296 | 98.876 | 0.006 | 327.034 | 2 | 190.435 | -1.229 | -3.003 | 47.279 | -0.115 | 1.974 | 237.714 | 0.443598 | b | t | 1.45485 |
| 32930 | 149988 | 91.594 | 70.574 | 57.615 | 34.452 | missing | missing | missing | 2.843 | 3.009 | 89.138 | 1.227 | -1.401 | missing | 25.443 | 1.4 | -1.228 | 31.219 | 1.939 | 2.264 | 45.691 | -0.149 | 111.575 | 1 | 32.476 | -1.057 | 2.877 | missing | missing | missing | 32.476 | 0.617144 | b | t | 2.02402 |
| 32931 | 149989 | missing | 68.957 | 76.191 | 2.304 | missing | missing | missing | 2.32 | 2.304 | 74.611 | 1.072 | -1.414 | missing | 36.01 | -1.651 | -0.452 | 38.601 | -0.594 | -2.517 | 40.181 | 1.633 | 96.391 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.43486 | b | t | 4.70584 |
| 32932 | 149992 | missing | 95.864 | 90.317 | 21.262 | missing | missing | missing | 2.907 | 21.262 | 64.717 | 2.167 | -1.096 | missing | 20.432 | 0.5 | 2.282 | 44.285 | -1.579 | 0.251 | 51.998 | -2.795 | 184.146 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 0.832855 | b | t | 2.73148 |
| 32933 | 149994 | 59.653 | 67.648 | 50.783 | 24.749 | missing | missing | missing | 1.621 | 24.749 | 72.522 | 1.717 | -1.414 | missing | 26.691 | -1.067 | -2.872 | 45.83 | -0.939 | -1.256 | 29.411 | 1.087 | 177.077 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 1.41038 | b | t | 4.62555 |
| 32934 | 149995 | 133.013 | 29.198 | 78.723 | 110.576 | 0.666 | 281.596 | -0.073 | 1.96 | 56.406 | 501.854 | 0.418 | 1.398 | 0.0 | 73.174 | 0.819 | 1.32 | 30.581 | 0.952 | -0.636 | 51.207 | 0.12 | 553.857 | 3 | 193.297 | 0.14 | -1.798 | 92.691 | -0.526 | 1.653 | 398.099 | 0.0254338 | b | t | 0.083414 |
| 32935 | 149996 | missing | 125.733 | 58.863 | 174.734 | 0.828 | 171.433 | -0.139 | 1.216 | 34.733 | 305.929 | 2.234 | 0.007 | 0.062 | 30.498 | -0.34 | 1.375 | 68.136 | 0.869 | 1.245 | 134.87 | -0.186 | 409.819 | 2 | 149.11 | -0.234 | -2.235 | 58.184 | 0.594 | 2.188 | 207.294 | 0.720062 | b | t | 2.36156 |
@df signal histogram(:PRI_tau_pt, alpha=0.4, label="Signal - Tau pT")
@df background histogram!(:PRI_tau_pt, alpha=0.4, label="Background - Tau pT")
@df signal histogram(:PRI_lep_pt, alpha=0.4, label="Signal - Lepton pT")
@df background histogram!(:PRI_lep_pt, alpha=0.4, label="Background - Lepton pT")
@df signal scatter(:PRI_tau_eta, :PRI_tau_phi, alpha=0.4, label="Signal")
@df background scatter!(:PRI_tau_eta, :PRI_tau_phi, alpha=0.4, label="Background")
# Thin the sample for plotting; η and ϕ must use the same stride so each point keeps its own (η, ϕ) pair
n_signal = nrow(signal)
@df signal scatter(:PRI_tau_eta[1:50:n_signal], :PRI_tau_phi[1:50:n_signal], alpha=0.4, label="Signal", title="(η, ϕ)")
n_background = nrow(background)
@df background scatter!(:PRI_tau_eta[1:50:n_background], :PRI_tau_phi[1:50:n_background], alpha=0.4, label="Background")
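The strided ranges in the scatter calls thin the data before plotting. As a minimal sketch of how that indexing behaves, here is a toy example on two made-up vectors (`η`, `ϕ`, `idx` and the values are illustrative only, not taken from the dataset):

```julia
# Toy stand-ins for two paired columns (hypothetical data)
η = collect(1:100)
ϕ = collect(101:200)

# One shared index range: every 10th element (1, 11, 21, ..., 91)
idx = 1:10:length(η)
η_thin, ϕ_thin = η[idx], ϕ[idx]

# Both thinned vectors have the same length, so element i of each
# still comes from the same original row
length(η_thin) == length(ϕ_thin)

# A different stride selects a different number of rows (here only 1 and 51),
# so mixing strides between x and y would no longer pair up rows
length(η[1:50:end])
```

Reusing a single index range for every column is one simple way to keep the coordinates of each plotted point consistent.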
# Distance measure between pairs of (η, ϕ) vectors
δϕ(ϕ1, ϕ2) = begin
    δ = ϕ1 - ϕ2
    while δ > pi
        δ -= 2π
    end
    while δ < -pi
        δ += 2π
    end
    δ
end
dist(η1::Number, ϕ1::Number, η2::Number, ϕ2::Number) = sqrt((η1-η2)^2 + δϕ(ϕ1, ϕ2)^2)
dist(η1, ϕ1, η2, ϕ2) = missing
dist (generic function with 2 methods)
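As a sanity check on the wrap-around logic, here is a loop-free sketch of the same idea. It relies on `rem2pi` from Julia Base, which reduces an angle to the interval [-π, π]. The names `δϕ_wrapped` and `dist_wrapped` are hypothetical and chosen only to avoid clashing with the definitions above; this is an alternative formulation, not the tutorial's own method.

```julia
# Hypothetical loop-free variant of δϕ: rem2pi with RoundNearest
# maps the difference into [-π, π]
δϕ_wrapped(ϕ1, ϕ2) = rem2pi(ϕ1 - ϕ2, RoundNearest)

# Same two-method pattern as dist: a numeric method plus a fallback
# that returns missing for any non-numeric (e.g. missing) argument
dist_wrapped(η1::Number, ϕ1::Number, η2::Number, ϕ2::Number) =
    sqrt((η1 - η2)^2 + δϕ_wrapped(ϕ1, ϕ2)^2)
dist_wrapped(η1, ϕ1, η2, ϕ2) = missing

# Two directions on either side of the ±π boundary are close in ϕ:
# the raw difference is 6.0, the wrapped one is 6.0 - 2π
δϕ_wrapped(3.0, -3.0)

# (η, ϕ) of the tau and lepton for EventId 100004, taken from the table;
# the raw Δϕ is -3.8, so the wrap matters here
dist_wrapped(-2.197, -2.231, 0.798, 1.569)

# Missing is not a Number, so dispatch falls through to the fallback method
dist_wrapped(missing, 0.0, 0.0, 0.0)
```

The second call should agree with the `Tau_lep_distance` of about 3.89053 that `select!` reports for that event.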
select!(signal, :EventId,
[:PRI_tau_eta, :PRI_tau_phi, :PRI_lep_eta, :PRI_lep_phi] => ByRow(dist) => :Tau_lep_distance, r"PRI.*", :)
select!(background, :EventId,
[:PRI_tau_eta, :PRI_tau_phi, :PRI_lep_eta, :PRI_lep_phi] => ByRow(dist) => :Tau_lep_distance, r"PRI.*", :)
32935×36 DataFrame
32910 rows omitted
| Row | EventId | Tau_lep_distance | PRI_tau_pt | PRI_tau_eta | PRI_tau_phi | PRI_lep_pt | PRI_lep_eta | PRI_lep_phi | PRI_met | PRI_met_phi | PRI_met_sumet | PRI_jet_num | PRI_jet_leading_pt | PRI_jet_leading_eta | PRI_jet_leading_phi | PRI_jet_subleading_pt | PRI_jet_subleading_eta | PRI_jet_subleading_phi | PRI_jet_all_pt | DER_mass_MMC | DER_mass_transverse_met_lep | DER_mass_vis | DER_pt_h | DER_deltaeta_jet_jet | DER_mass_jet_jet | DER_prodeta_jet_jet | DER_deltar_tau_lep | DER_pt_tot | DER_sum_pt | DER_pt_ratio_lep_tau | DER_met_phi_centrality | DER_lep_eta_centrality | Weight | Label | KaggleSet | KaggleWeight |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64 | Float64? | Float64 | Float64 | Float64 | Float64? | Float64? | Float64? | Float64 | Float64 | Float64 | Float64 | Float64 | Float64? | Float64 | String1 | String1 | Float64 |
| 1 | 100001 | 3.4731 | 42.014 | 2.039 | -3.011 | 36.918 | 0.501 | 0.103 | 44.704 | -1.916 | 164.546 | 1 | 46.226 | 0.725 | 1.158 | missing | missing | missing | 46.226 | 160.937 | 68.768 | 103.235 | 48.146 | missing | missing | missing | 3.473 | 2.078 | 125.157 | 0.879 | 1.414 | missing | 0.681042 | b | t | 2.23358 |
| 2 | 100002 | 3.14797 | 32.154 | -0.705 | -2.093 | 121.409 | -0.953 | 1.052 | 54.283 | -2.186 | 260.414 | 1 | 44.251 | 2.053 | -2.028 | missing | missing | missing | 44.251 | missing | 162.172 | 125.953 | 35.635 | missing | missing | missing | 3.148 | 9.336 | 197.814 | 3.776 | 1.414 | missing | 0.715742 | b | t | 2.34739 |
| 3 | 100003 | 3.30995 | 22.647 | -1.655 | 0.01 | 53.321 | -0.522 | -3.1 | 31.082 | 0.06 | 86.062 | 0 | missing | missing | missing | missing | missing | missing | -0.0 | 143.905 | 81.417 | 80.943 | 0.414 | missing | missing | missing | 3.31 | 0.414 | 75.968 | 2.354 | -1.285 | missing | 1.66065 | b | t | 5.44638 |
| 4 | 100004 | 3.89053 | 28.209 | -2.197 | -2.231 | 29.774 | 0.798 | 1.569 | 2.723 | -0.871 | 53.131 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 175.864 | 16.915 | 134.805 | 16.405 | missing | missing | missing | 3.891 | 16.405 | 57.983 | 1.056 | -1.385 | missing | 1.90426 | b | t | 6.24533 |
| 5 | 100005 | 1.36155 | 53.651 | 0.371 | 1.329 | 31.565 | -0.884 | 1.857 | 40.735 | 2.237 | 282.849 | 3 | 90.547 | -2.412 | -0.653 | 56.165 | 0.224 | 3.106 | 193.66 | 89.744 | 13.55 | 59.149 | 116.344 | 2.636 | 284.584 | -0.54 | 1.362 | 61.619 | 278.876 | 0.588 | 0.479 | 0.975 | 0.0254338 | b | t | 0.083414 |
| 6 | 100008 | 2.90312 | 39.008 | 2.433 | -2.532 | 26.325 | 0.21 | 1.884 | 37.791 | 0.024 | 129.804 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 105.594 | 50.559 | 100.989 | 4.288 | missing | missing | missing | 2.904 | 4.288 | 65.333 | 0.675 | -1.366 | missing | 1.6148 | b | t | 5.296 |
| 7 | 100010 | 2.33806 | 29.718 | -0.866 | 2.878 | 52.016 | 0.126 | -1.288 | 51.276 | 0.688 | 250.178 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | missing | 86.24 | 79.692 | 27.201 | missing | missing | missing | 2.338 | 27.201 | 81.734 | 1.75 | -1.412 | missing | 0.701141 | b | t | 2.2995 |
| 8 | 100011 | 2.88831 | 35.976 | -0.669 | -0.342 | 38.188 | -0.165 | 2.502 | 22.385 | 2.148 | 290.547 | 3 | 76.773 | -0.79 | 0.303 | 56.876 | 1.773 | -2.079 | 165.64 | 114.744 | 10.286 | 75.712 | 30.816 | 2.563 | 252.599 | -1.401 | 2.888 | 36.745 | 239.804 | 1.061 | 1.364 | 0.769 | 0.093659 | b | t | 0.30717 |
| 9 | 100012 | 2.18279 | 62.89 | -0.766 | -1.632 | 36.237 | 0.722 | -0.035 | 43.91 | -1.907 | 232.362 | 1 | 93.117 | -0.97 | 1.943 | missing | missing | missing | 93.117 | 145.297 | 64.234 | 103.565 | 106.999 | missing | missing | missing | 2.183 | 24.66 | 192.245 | 0.576 | 0.689 | missing | 0.51274 | b | t | 1.68161 |
| 10 | 100013 | 2.82323 | 25.47 | -0.654 | -2.99 | 33.179 | -1.665 | -0.354 | 12.439 | 1.433 | 163.42 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 82.488 | 31.663 | 64.128 | 8.232 | missing | missing | missing | 2.823 | 8.232 | 58.649 | 1.303 | -1.414 | missing | 0.66589 | b | t | 2.18389 |
| 11 | 100014 | 0.472518 | 22.552 | 1.389 | 1.34 | 40.013 | 1.856 | 1.412 | 75.197 | -1.583 | 198.616 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | missing | 109.412 | 14.398 | 17.323 | missing | missing | missing | 0.472 | 17.323 | 62.565 | 1.774 | -0.272 | missing | 0.655922 | b | t | 2.1512 |
| 12 | 100016 | 2.9539 | 30.145 | 0.484 | -0.929 | 34.522 | -0.215 | 1.941 | 41.899 | 2.055 | 191.568 | 1 | 36.263 | -0.766 | -0.686 | missing | missing | missing | 36.263 | 114.256 | 4.351 | 67.963 | 47.221 | missing | missing | missing | 2.954 | 26.243 | 100.93 | 1.145 | 0.218 | missing | 0.443598 | b | t | 1.45485 |
| 13 | 100018 | 2.11628 | 27.931 | 1.175 | 2.356 | 43.512 | 2.332 | 0.584 | 44.698 | -2.033 | 151.816 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | missing | 85.186 | 68.827 | 5.042 | missing | missing | missing | 2.116 | 5.042 | 71.443 | 1.558 | -1.351 | missing | 1.56163 | b | t | 5.12162 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 32924 | 149980 | 2.25224 | 44.39 | 0.683 | -0.507 | 26.523 | -1.394 | 0.364 | 74.833 | 3.103 | 149.571 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | missing | 87.304 | 89.604 | 14.486 | missing | missing | missing | 2.252 | 14.486 | 70.913 | 0.597 | -1.411 | missing | 1.55172 | b | t | 5.08911 |
| 32925 | 149982 | 1.12504 | 23.599 | 1.709 | 0.901 | 43.155 | 2.392 | 0.007 | 51.511 | -2.847 | 144.558 | 0 | missing | missing | missing | missing | missing | missing | -0.0 | missing | 93.32 | 35.518 | 9.338 | missing | missing | missing | 1.125 | 9.338 | 66.754 | 1.829 | -1.341 | missing | 0.643797 | b | t | 2.11143 |
| 32926 | 149983 | 3.4056 | 53.694 | 0.209 | -1.27 | 34.799 | -1.839 | 1.451 | 50.239 | -1.537 | 253.344 | 3 | 64.015 | 1.938 | 1.029 | 47.673 | 1.824 | 2.178 | 148.696 | 225.397 | 83.38 | 134.68 | 70.387 | 0.114 | 62.485 | 3.535 | 3.406 | 38.621 | 237.189 | 0.648 | -0.366 | 0.0 | 0.334829 | b | t | 1.09813 |
| 32927 | 149984 | 2.93168 | 24.206 | 2.3 | 1.188 | 46.27 | 2.179 | -2.166 | 20.348 | -1.565 | 84.865 | 1 | 38.893 | 1.232 | 1.161 | missing | missing | missing | 38.893 | 93.899 | 18.141 | 66.71 | 39.914 | missing | missing | missing | 2.932 | 1.387 | 109.369 | 1.912 | -1.388 | missing | 0.51274 | b | t | 1.68161 |
| 32928 | 149985 | 0.498182 | 102.883 | -0.481 | 0.744 | 160.477 | -0.334 | 1.22 | 122.771 | 0.718 | 799.338 | 3 | 320.452 | 0.758 | -2.373 | 143.898 | 1.407 | -1.119 | 505.049 | 94.095 | 69.654 | 63.508 | 374.919 | 0.649 | 292.152 | 1.067 | 0.498 | 75.552 | 768.408 | 1.56 | 0.946 | 0.0 | 0.0254338 | b | t | 0.083414 |
| 32929 | 149987 | 0.839649 | 25.186 | -2.105 | 0.395 | 90.511 | -1.628 | -0.296 | 98.876 | 0.006 | 327.034 | 2 | 190.435 | -1.229 | -3.003 | 47.279 | -0.115 | 1.974 | 237.714 | 68.068 | 28.454 | 39.672 | 209.323 | 1.114 | 165.893 | 0.142 | 0.839 | 1.906 | 353.411 | 3.594 | 1.404 | 0.053 | 0.443598 | b | t | 1.45485 |
| 32930 | 149988 | 2.84275 | 25.443 | 1.4 | -1.228 | 31.219 | 1.939 | 2.264 | 45.691 | -0.149 | 111.575 | 1 | 32.476 | -1.057 | 2.877 | missing | missing | missing | 32.476 | 91.594 | 70.574 | 57.615 | 34.452 | missing | missing | missing | 2.843 | 3.009 | 89.138 | 1.227 | -1.401 | missing | 0.617144 | b | t | 2.02402 |
| 32931 | 149989 | 2.3198 | 36.01 | -1.651 | -0.452 | 38.601 | -0.594 | -2.517 | 40.181 | 1.633 | 96.391 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | missing | 68.957 | 76.191 | 2.304 | missing | missing | missing | 2.32 | 2.304 | 74.611 | 1.072 | -1.414 | missing | 1.43486 | b | t | 4.70584 |
| 32932 | 149992 | 2.90641 | 20.432 | 0.5 | 2.282 | 44.285 | -1.579 | 0.251 | 51.998 | -2.795 | 184.146 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | missing | 95.864 | 90.317 | 21.262 | missing | missing | missing | 2.907 | 21.262 | 64.717 | 2.167 | -1.096 | missing | 0.832855 | b | t | 2.73148 |
| 32933 | 149994 | 1.62106 | 26.691 | -1.067 | -2.872 | 45.83 | -0.939 | -1.256 | 29.411 | 1.087 | 177.077 | 0 | missing | missing | missing | missing | missing | missing | 0.0 | 59.653 | 67.648 | 50.783 | 24.749 | missing | missing | missing | 1.621 | 24.749 | 72.522 | 1.717 | -1.414 | missing | 1.41038 | b | t | 4.62555 |
| 32934 | 149995 | 1.96052 | 73.174 | 0.819 | 1.32 | 30.581 | 0.952 | -0.636 | 51.207 | 0.12 | 553.857 | 3 | 193.297 | 0.14 | -1.798 | 92.691 | -0.526 | 1.653 | 398.099 | 133.013 | 29.198 | 78.723 | 110.576 | 0.666 | 281.596 | -0.073 | 1.96 | 56.406 | 501.854 | 0.418 | 1.398 | 0.0 | 0.0254338 | b | t | 0.083414 |
| 32935 | 149996 | 1.21597 | 30.498 | -0.34 | 1.375 | 68.136 | 0.869 | 1.245 | 134.87 | -0.186 | 409.819 | 2 | 149.11 | -0.234 | -2.235 | 58.184 | 0.594 | 2.188 | 207.294 | missing | 125.733 | 58.863 | 174.734 | 0.828 | 171.433 | -0.139 | 1.216 | 34.733 | 305.929 | 2.234 | 0.007 | 0.062 | 0.720062 | b | t | 2.36156 |
@df signal histogram(:Tau_lep_distance, alpha=0.4, label="Signal", title="τ - lepton distance")
@df background histogram!(:Tau_lep_distance, alpha=0.4, label="Background")